DeepFactors: Real-time probabilistic dense monocular SLAM
The ability to estimate rich geometry and camera motion from monocular imagery is fundamental to future interactive robotics and augmented reality applications. Different approaches have been proposed that vary in scene geometry representation (sparse landmarks, dense maps), the consistency metric used for optimising the multi-view problem, and the use of learned priors. We present a SLAM system that unifies these methods in a probabilistic framework while still maintaining real-time performance. This is achieved through the use of a learned compact depth map representation and by reformulating three different types of error (photometric, reprojection, and geometric) for use within standard factor graph software. We evaluate our system on trajectory estimation and depth reconstruction on real-world sequences, and present various examples of estimated dense geometry.
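To make the error reformulation concrete, here is a minimal sketch of optimising a compact depth code against simple residuals in a single least-squares solve. The linear basis decoder, the smoothness proxy, and all dimensions are illustrative assumptions; the actual system uses a learned network decoder and photometric, reprojection, and geometric factors inside standard factor graph software.

```python
import numpy as np
from scipy.optimize import least_squares

H, W, K = 8, 8, 4                                # tiny image and code sizes
rng = np.random.default_rng(0)
basis = rng.normal(size=(K, H, W))               # stand-in for a learned decoder
depth_obs = 2.0 + 0.1 * rng.normal(size=(H, W))  # e.g. a noisy depth estimate

def decode(code):
    # Depth as a mean plus a linear combination of basis maps: a toy
    # stand-in for the learned compact depth map representation.
    return 2.0 + np.tensordot(code, basis, axes=1)

def residuals(code):
    d = decode(code)
    r_geometric = (d - depth_obs).ravel()        # geometric-style error
    r_smooth = np.diff(d, axis=1).ravel()        # crude prior/photometric proxy
    return np.concatenate([r_geometric, 0.1 * r_smooth])

sol = least_squares(residuals, x0=np.zeros(K))   # factor-graph-style LS solve
print("optimised depth code:", sol.x)
```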
RIDI: Robust IMU Double Integration
This paper proposes a novel data-driven approach to inertial navigation, which learns to estimate trajectories of natural human motions using only the inertial measurement unit (IMU) found in every smartphone. The key observation is that human motions are repetitive and consist of a few major modes (e.g., standing, walking, or turning). Our algorithm regresses a velocity vector from the history of linear accelerations and angular velocities, then corrects the low-frequency bias in the linear accelerations, which are integrated twice to estimate positions. We have acquired training data with ground-truth motions across multiple human subjects and multiple phone placements (e.g., in a bag or a hand). Qualitative and quantitative evaluations demonstrate that our algorithm achieves results surprisingly comparable to full visual-inertial navigation. To our knowledge, this paper is the first to integrate sophisticated machine learning techniques with inertial navigation, potentially opening up a new line of research in the domain of data-driven inertial navigation. We will publicly share our code and data to facilitate further research.
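As a rough illustration of the corrected double integration described above, the sketch below regresses a velocity from a window of IMU history and uses it to remove low-frequency accelerometer bias before integrating twice. The window length, the proportional bias correction, and the `regress_velocity` callable are assumptions for illustration, not the paper's actual model.

```python
import numpy as np

def ridi_style_track(acc, gyr, dt, regress_velocity, window=200):
    """Hypothetical corrected double integration.

    acc, gyr: (N, 3) linear accelerations and angular velocities.
    regress_velocity(acc_win, gyr_win) -> (3,) velocity estimate,
    standing in for the learned regressor (trained elsewhere).
    """
    vel, pos = np.zeros(3), np.zeros(3)
    positions = []
    for i in range(len(acc)):
        win = slice(max(0, i - window), i + 1)       # IMU history window
        v_pred = regress_velocity(acc[win], gyr[win])
        # Treat the drift between integrated and regressed velocity as a
        # low-frequency accelerometer bias and subtract it (a simple
        # proportional correction, assumed for this sketch).
        bias = (vel - v_pred) / (window * dt)
        vel = vel + (acc[i] - bias) * dt             # first integration
        pos = pos + vel * dt                         # second integration
        positions.append(pos.copy())
    return np.array(positions)

# Dummy regressor that always predicts a constant walking velocity:
traj = ridi_style_track(np.zeros((500, 3)), np.zeros((500, 3)), dt=0.005,
                        regress_velocity=lambda a, g: np.array([1.0, 0.0, 0.0]))
```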
Simultaneous Optical Flow and Intensity Estimation from an Event Camera
Event cameras are bio-inspired vision sensors which mimic retinas by measuring per-pixel intensity changes rather than outputting an actual intensity image. This paradigm shift away from traditional frame cameras offers significant potential advantages, namely avoiding high data rates, dynamic range limitations, and motion blur. However, established computer vision algorithms cannot be applied directly to event data. Methods proposed so far to reconstruct images, estimate optical flow, track a camera, and reconstruct a scene come with severe restrictions on the environment or on the motion of the camera, e.g. allowing only rotation. Here, we propose, to the best of our knowledge, the first algorithm to simultaneously recover the motion field and brightness image while the camera undergoes generic motion through any scene. Our approach minimises a cost function that contains the asynchronous event data as well as spatial and temporal regularisation within a sliding window time interval. Our implementation relies on GPU optimisation and runs in near real-time. In a series of examples, we demonstrate the successful operation of our framework, including in situations where conventional cameras suffer from dynamic range limitations and motion blur.
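The following toy cost shows the general shape of such a sliding-window objective: an asynchronous event data term plus spatial and temporal regularisers. The contrast threshold, the weights, and the discretisation of events onto window frames are illustrative assumptions, not the paper's exact functional.

```python
import numpy as np

def window_cost(log_I, flow, events, C=0.2, lam_s=0.1, lam_t=0.1):
    """Toy sliding-window cost over log-intensity and a motion field.

    log_I:  (T, H, W) log-intensity frames spanning the window.
    flow:   (H, W, 2) per-pixel motion field.
    events: iterable of (t, x, y, p) with polarity p in {-1, +1}
            and t < T - 1 (events discretised onto window frames here).
    """
    # Data term: each event signals a log-intensity change of about p*C.
    data = sum((log_I[t + 1, y, x] - log_I[t, y, x] - p * C) ** 2
               for t, x, y, p in events)
    # Spatial smoothness of intensity and flow, temporal smoothness of intensity.
    spatial = sum(np.sum(np.gradient(log_I[-1], axis=a) ** 2) for a in (0, 1))
    spatial += sum(np.sum(np.gradient(flow[..., k], axis=a) ** 2)
                   for k in (0, 1) for a in (0, 1))
    temporal = np.sum(np.diff(log_I, axis=0) ** 2)
    return data + lam_s * spatial + lam_t * temporal
```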
Pairwise Decomposition of Image Sequences for Active Multi-View Recognition
A multi-view image sequence provides a much richer capacity for object recognition than a single image. However, most existing solutions to multi-view recognition adopt hand-crafted, model-based geometric methods, which do not readily embrace recent trends in deep learning. We propose to bring Convolutional Neural Networks to generic multi-view recognition by decomposing an image sequence into a set of image pairs, classifying each pair independently, and then learning an object classifier by weighting the contribution of each pair. This allows for recognition over arbitrary camera trajectories, without requiring explicit training over the potentially infinite number of camera paths and lengths. Building these pairwise relationships then naturally extends to the next-best-view problem in an active recognition framework. To achieve this, we train a second Convolutional Neural Network to map directly from an observed image to the next viewpoint. Finally, we incorporate this into a trajectory optimisation task, whereby the best recognition confidence is sought for a given trajectory length. We present state-of-the-art results in both guided and unguided multi-view recognition on the ModelNet dataset, and show how our method can be used with depth images, greyscale images, or both.
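A minimal sketch of the pairwise decomposition is below: every image pair in the sequence is classified independently and the per-pair predictions are fused with learned weights. The `pair_classifier` and `pair_weight` callables stand in for the trained CNNs; the weighted-average fusion rule is assumed for illustration.

```python
import numpy as np
from itertools import combinations

def classify_sequence(images, pair_classifier, pair_weight):
    """Fuse per-pair class predictions over an arbitrary image sequence.

    pair_classifier(img_a, img_b) -> (num_classes,) probability vector
    pair_weight(img_a, img_b)     -> scalar confidence for that pair
    Both stand in for learned CNNs (assumed trained elsewhere).
    """
    probs, weights = [], []
    for a, b in combinations(images, 2):     # decompose into image pairs
        probs.append(pair_classifier(a, b))
        weights.append(pair_weight(a, b))
    probs, weights = np.asarray(probs), np.asarray(weights)[:, None]
    fused = (weights * probs).sum(axis=0) / weights.sum()
    return int(np.argmax(fused)), fused      # label and fused distribution
```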
Learning to complete object shapes for object-level mapping in dynamic scenes
In this paper, we propose a novel object-level mapping system that can simultaneously segment, track, and reconstruct objects in dynamic scenes. It can further predict and complete their full geometries by conditioning on reconstructions from depth inputs and a category-level shape prior, with the aim that completed object geometry leads to better object reconstruction and tracking accuracy. For each incoming RGB-D frame, we perform instance segmentation to detect objects and build data associations between the detections and the existing object maps. A new object map is created for each unmatched detection. For each matched object, we jointly optimise its pose and latent geometry representation using geometric and differentiable rendering residuals towards its shape prior and completed geometry. Our approach shows better tracking and reconstruction performance than methods using traditional volumetric mapping or learned shape priors. We evaluate its effectiveness quantitatively and qualitatively on both synthetic and real-world sequences.
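One concrete step in the pipeline above is the per-frame data association between detections and existing object maps. The greedy IoU matching below is an assumed rule for illustration; the abstract does not specify the exact criterion.

```python
def associate_detections(detections, object_maps, iou, threshold=0.3):
    """Greedily match detected instance masks to existing object maps.

    detections:  instance masks from the current RGB-D frame.
    object_maps: objects able to render a predicted mask into this view
                 (exposed here as `om.predicted_mask`, a hypothetical field).
    iou(a, b):   mask-overlap score in [0, 1].
    """
    matches, unmatched, used = [], [], set()
    for det in detections:
        scores = [(iou(det, om.predicted_mask), j)
                  for j, om in enumerate(object_maps) if j not in used]
        score, j = max(scores, default=(0.0, None))
        if j is not None and score >= threshold:
            matches.append((det, object_maps[j]))
            used.add(j)
        else:
            unmatched.append(det)            # each spawns a new object map
    return matches, unmatched
```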
Marker based Thermal-Inertial Localization for Aerial Robots in Obscurant Filled Environments
For robotic inspection tasks in known environments, fiducial markers provide a reliable and low-cost solution for robot localization. However, detection of such markers relies on the quality of RGB camera data, which degrades significantly in the presence of visual obscurants such as fog and smoke. The ability to navigate known environments in the presence of obscurants can be critical for inspection tasks, especially in the aftermath of a disaster. Addressing such a scenario, this work proposes a method for the design of fiducial markers to be used with thermal cameras for the pose estimation of aerial robots. Our low-cost markers are designed to work in the long-wave infrared spectrum, which is not affected by the presence of obscurants, and can be affixed to any object that has a measurable temperature difference with respect to its surroundings. Furthermore, the estimated pose from the fiducial markers is fused with inertial measurements in an extended Kalman filter to remove high-frequency noise and error present in the fiducial pose estimates. The proposed markers and the pose estimation method are experimentally evaluated in an obscurant-filled environment using an aerial robot carrying a thermal camera.
Comment: 10 pages, 5 figures, Published in International Symposium on Visual Computing 201
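The fusion step lends itself to a compact sketch: a single-axis Kalman filter predicted with IMU acceleration and corrected by marker position fixes. The one-dimensional state and the noise values are simplifying assumptions; the actual filter estimates a full 6-DoF pose.

```python
import numpy as np

def ekf_step(x, P, accel, z_pos, dt, q=1e-3, r=1e-2):
    """One predict/update cycle of a toy single-axis filter.

    x = [position, velocity]; accel is the IMU input, z_pos the
    position observed from the fiducial marker. q, r are assumed
    process/measurement noise levels for this sketch.
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])
    B = np.array([0.5 * dt**2, dt])
    x = F @ x + B * accel                     # IMU-driven prediction
    P = F @ P @ F.T + q * np.eye(2)
    H = np.array([[1.0, 0.0]])                # marker observes position only
    y = z_pos - H @ x                         # innovation
    S = H @ P @ H.T + r                       # innovation covariance (1x1)
    K = P @ H.T / S                           # Kalman gain
    x = x + (K * y).ravel()                   # corrected state
    P = (np.eye(2) - K @ H) @ P
    return x, P
```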
MonoSLAM: Real-time single camera SLAM
Published version
Bundle adjustment on a graph processor
Graph processors such as Graphcore's Intelligence Processing Unit (IPU) are part of the major new wave of novel computer architecture for AI, and have a general design with massively parallel computation, distributed on-chip memory and very high inter-core communication bandwidth which allows breakthrough performance for message passing algorithms on arbitrary graphs. We show for the first time that the classical computer vision problem of bundle adjustment (BA) can be solved extremely fast on a graph processor using Gaussian Belief Propagation. Our simple but fully parallel implementation uses the 1216 cores on a single IPU chip to, for instance, solve a real BA problem with 125 keyframes and 1919 points in under 40ms, compared to 1450ms for the Ceres CPU library. Further code optimisation will surely increase this difference on static problems, but we argue that the real promise of graph processing is for flexible in-place optimisation of general, dynamically changing factor graphs representing Spatial AI problems. We give indications of this with experiments showing the ability of GBP to efficiently solve incremental SLAM problems, and deal with robust cost functions and different types of factors.
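To illustrate the mechanism (not the BA problem itself), here is Gaussian Belief Propagation on a toy one-dimensional chain, with all quantities kept in information form. The chain structure, precisions, and synchronous schedule are assumptions for the sketch; real bundle adjustment uses 6-DoF pose and landmark variables with reprojection factors.

```python
import numpy as np

# GBP on a chain x0 - x1 - x2: unary measurement factors anchor each
# variable; pairwise factors x_i ~ x_j couple neighbours. Messages and
# beliefs are stored in information form (eta, Lambda).
z = np.array([0.0, 1.0, 2.0])                 # unary measurements
l_meas, l_pair = 1.0, 2.0                     # factor precisions
edges = [(0, 1), (1, 2)]
m_eta = {(i, j): 0.0 for a, b in edges for (i, j) in ((a, b), (b, a))}
m_lam = {k: 0.0 for k in m_eta}

def beliefs():
    b_eta = l_meas * z + np.array([sum(e for (k, i), e in m_eta.items() if i == v)
                                   for v in range(3)])
    b_lam = l_meas + np.array([sum(l for (k, i), l in m_lam.items() if i == v)
                               for v in range(3)])
    return b_eta, b_lam

for _ in range(30):                           # synchronous message passing
    b_eta, b_lam = beliefs()
    for a, b in edges:
        for i, j in ((a, b), (b, a)):
            ce = b_eta[i] - m_eta[(j, i)]     # cavity: drop the return message
            cl = b_lam[i] - m_lam[(j, i)]
            m_lam[(i, j)] = l_pair - l_pair**2 / (cl + l_pair)
            m_eta[(i, j)] = l_pair * ce / (cl + l_pair)

b_eta, b_lam = beliefs()
print("posterior means:", b_eta / b_lam)      # MAP estimate of the chain
```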
Monocular, Real-Time Surface Reconstruction using Dynamic Level of Detail
We present a scalable, real-time capable method for robust surface reconstruction that explicitly handles multiple scales. As a monocular camera browses a scene, our algorithm processes images as they arrive and incrementally builds a detailed surface model. While most existing reconstruction approaches rely on volumetric or point-cloud representations of the environment, we perform depth-map and colour fusion directly into a multi-resolution triangular mesh that can be adaptively tessellated using the concept of Dynamic Level of Detail. Our method relies on least-squares optimisation, which enables a probabilistically sound and principled formulation of the fusion algorithm. We demonstrate that our method is capable of obtaining high-quality close-up reconstructions, as well as capturing overall scene geometry, while being memory- and computationally efficient.
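At its core, probabilistic least-squares fusion of a new depth sample into a surface estimate reduces to combining Gaussian estimates by precision-weighted averaging. The per-vertex scalar form below is a simplifying assumption; the paper's formulation operates on a multi-resolution mesh and fuses colour as well.

```python
def fuse_vertex_depth(depth, prec, obs_depth, obs_prec):
    """Fuse a new Gaussian depth observation into a mesh vertex estimate.

    Combining two Gaussians in a least-squares sense gives the
    precision-weighted mean; precisions (inverse variances) add.
    """
    new_prec = prec + obs_prec
    new_depth = (prec * depth + obs_prec * obs_depth) / new_prec
    return new_depth, new_prec

# Example: vertex at 2.0 m (precision 4.0) fused with an observation
# at 2.1 m (precision 1.0):
print(fuse_vertex_depth(2.0, 4.0, 2.1, 1.0))   # -> (2.02, 5.0)
```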